Association Rules Mining and Statistic Test Over Multiple Datasets on TCM Drug Pairs
Abstract
Objective: A TCM drug pair consists of exactly two drugs and is the smallest drug group that follows special drug compatibility regulations. Formulae compatibility regulations are one of the most important problems in TCM clinical practice and modern research, but they remain largely unresolved. TCM drug pairs are therefore well-suited objects for uncovering the complicated formulae compatibility regulations. This paper applies association rules mining to study the structural characteristics of TCM drug pairs and to find special relationships between drugs, which may help research on formulae compatibility regulations. Methods: We present an enhanced association rules mining method to find property associations between the two drugs in a TCM drug pair, and introduce a binomial statistical test to assess the statistical significance of the mined rules. Property data from 625 drug pairs containing 347 drugs were collected and analyzed. Because most association rules mining runs on a single database only, the new method finds rules over multiple databases (two in this paper, one for each drug in a TCM drug pair), building on an initial Apriori mining pass. The statistical test is then applied to filter out insignificant rules. Results: The Apriori algorithm and the new method were both applied to mine association rules on TCM drug pairs for comparison. The rules found by Apriori showed falsely high support, part of which came from property associations within one drug rather than between the two drugs of a pair. Apriori also could not find associations of a replicated property, such as liver-liver rules. The new method returned only associations between drugs, including replicated-property rules. Some associations were mined with high support and significance. Conclusion: This paper proposes an enhanced method for association rules mining over multiple databases. In comparison with the Apriori algorithm, the new method obtains only those associations in which each item comes from a different database. The method proved well suited to mining over multiple databases, and the statistical test was necessary to exclude false association rules.
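The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea under stated assumptions, not the authors' implementation. It assumes each drug pair is stored as two property sets (one per "database"), counts only rules whose two items come from different drugs, and scores each rule with a one-sided binomial test whose null probability is the product of the two marginal frequencies. The function names, the toy data, and the choice of null hypothesis are all hypothetical.

```python
from collections import Counter
from itertools import product
from math import comb


def cross_drug_counts(pairs):
    """Count rules (a, b) with a taken from the first drug and b from the second."""
    counts = Counter()
    for first_props, second_props in pairs:
        # product() only combines items across the two drugs, so two
        # properties of the same drug never form a rule, while a
        # replicated property such as (liver, liver) is still counted.
        counts.update(product(first_props, second_props))
    return counts


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def rule_p_value(rule, pairs, counts):
    """One-sided p-value of a rule under an assumed independence null."""
    a, b = rule
    n = len(pairs)
    # Assumption: under the null, the properties of the two drugs are
    # independent, so the rule's probability is the product of marginals.
    p_a = sum(a in first for first, _ in pairs) / n
    p_b = sum(b in second for _, second in pairs) / n
    return binomial_upper_tail(counts[rule], n, p_a * p_b)


# Hypothetical toy data: (properties of drug 1, properties of drug 2).
pairs = [
    ({"liver", "warm"}, {"liver"}),
    ({"liver"}, {"liver", "cold"}),
    ({"spleen", "warm"}, {"liver"}),
    ({"liver"}, {"spleen"}),
]

counts = cross_drug_counts(pairs)
support = {rule: c / len(pairs) for rule, c in counts.items()}

for rule, s in sorted(support.items()):
    print(rule, s, rule_p_value(rule, pairs, counts))
```

In this toy data "liver" and "warm" co-occur inside the first drug of one pair, yet no (liver, warm) rule is produced, because items are only combined across the two drugs. A single-database Apriori run over the merged property sets would count that co-occurrence toward support.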
Similar resources
Employing data mining to explore association rules in drug addicts
Drug addiction is a major social, economic, and hygienic challenge that affects the whole community and poses a serious threat. Available treatments are successful only in the short term unless the underlying reasons that make individuals prone to the phenomenon are investigated. Nowadays, there are some treatment centers which have comprehensive information about addicted people. Therefore, given the ...
A new approach based on data envelopment analysis with double frontiers for ranking the discovered rules from data mining
Data envelopment analysis (DEA) is a relatively new data-oriented approach for evaluating the performance of a set of peer entities called decision-making units (DMUs) that convert multiple inputs into multiple outputs. Within a relatively short period, DEA has become a strong quantitative and analytical tool for measuring and evaluating performance. In an article written by Toloo et al. (2009...
Association Rules Mining on Heart Failure Differential Treatment Based on the Improved Firefly Algorithm
This work studies heart failure medical cases of TCM (traditional Chinese medicine) in order to mine association rules of differential diagnosis and treatment effectively. TCM medical cases involve vast amounts of strongly related data, and a new, improved firefly algorithm guided by normative knowledge has been proposed to overcome the shortcomings of traditional association rules mi...
A Recent Survey on Incremental Temporal Association Rule Mining
Abstract: One of the most challenging areas in data mining is association rule mining. Several algorithms have been developed to solve this problem. These algorithms work efficiently with static datasets. But if new records are added to the datasets from time to time, that is, if the datasets are incremental in nature, the set of association rules may change. Some of the new itemsets may become fre...
Optimizing Membership Functions using Learning Automata for Fuzzy Association Rule Mining
Transactions in web data often consist of quantitative data, suggesting that fuzzy set theory can be used to represent such data. The time spent by users on each web page, one type of web data, is represented by a trapezoidal membership function (TMF) and can be used to evaluate user browsing behavior. The quality of mining fuzzy association rules depends on membership functions and since t...